Efficient Implementation of Software Release Consistency on Asymmetric Distributed Shared Memory

Authors

  • Junpei Niwa
  • Tatsushi Inagaki
  • Takashi Matsumoto
  • Kei Hiraki
Abstract

A shared memory system can reduce the programming effort required on distributed memory systems. On distributed systems such as networks of computers, the shared memory model must be provided by software. We have proposed the “Asymmetric Distributed Shared Memory” (ADSM), which provides high-speed-update distributed memory access in software. The compiler for the ADSM translates code that reads shared memory into a single load instruction and translates code that writes shared memory into a sequence of instructions, so the instructions managing consistency are separated from the store instruction. The instructions managing consistency are inserted explicitly after the store instruction. This property enables various optimizations: (1) the number of instructions managing consistency can be reduced, and (2) a protocol can be selected per page. We have implemented prototypes of the compiler and the runtime system for the ADSM on a Fujitsu AP1000+ multicomputer. Using three of the SPLASH-2 kernel benchmarks, we evaluate the effectiveness of the optimizations in the ADSM scheme through preliminary experiments.

Submitted to SC ’97. Technical report, Department of Information Science, Faculty of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan. Report date: June 11, 1997. Total pages: 13. Written language: English. Number of references: 13. This technical report is available only via anonymous FTP from ftp.is.s.u-tokyo.ac.jp (directory /pub/tech-reports).

Programming on distributed memory systems is very difficult because programmers must use a message-passing library directly. A shared memory system can reduce the programming effort required on distributed systems.
On distributed systems such as networks of computers, the shared memory model must be provided by software. We have proposed the “Asymmetric Distributed Shared Memory” (ADSM), which provides high-speed-update distributed memory access in software. For reads, the ADSM maintains a page-sized cache and exploits the virtual shared memory mechanism. For writes, the ADSM executes a sequence of instructions for managing consistency after the corresponding store instruction. In this paper we describe optimization methods for the ADSM. Because the compiler inserts the sequence of instructions for managing consistency after the store instruction, the ADSM supports protocol selection per page. To perform write detection and write collection efficiently, we define a write history, which consists of the corresponding address and size. Using this scheme, we have implemented the LRC protocol, a software-emulated AURC protocol, and a hybrid of LRC and AURC on the ADSM. To generate efficient instructions for managing consistency, we implement pointer analysis and classify each store as local, must-shared, or possibly-shared. When a store is must-shared, the compiler inserts the instructions for managing consistency. When a store is possibly-shared, the compiler inserts instructions that check the address dynamically and manage consistency if the address is a shared location. We describe an algorithm that combines sequences of instructions for managing consistency. First, the compiler generates code using the pointer analysis. For each loop nest, the algorithm proceeds from the inner loop to the outer loop, checking whether instructions for managing consistency exist. If the instructions for managing consistency are loop-invariant or consist of induction variables, and there are no synchronization points, they are combined and hoisted out to the outer loop.
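The write history, the three store classes, and the loop-level combining described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the authors' compiler or runtime: the names `WriteRecord`, `StoreKind`, `log_write`, `unoptimized`, and `hoisted`, the shared address range, and the 8-byte element size are all hypothetical, and the real system emits machine instructions after each store rather than calling a function.

```python
from dataclasses import dataclass
from enum import Enum, auto

# All names and constants here are hypothetical; the abstract does not
# specify the runtime's interface or memory layout.
SHARED_BASE = 0x1000_0000   # assumed start of the shared address range
SHARED_SIZE = 0x0100_0000   # assumed size of the shared address range
ELEM = 8                    # assumed element size in bytes


class StoreKind(Enum):
    """Result of the compile-time pointer analysis for one store."""
    LOCAL = auto()
    MUST_SHARED = auto()
    POSSIBLY_SHARED = auto()


@dataclass(frozen=True)
class WriteRecord:
    """One write-history entry: the written address and the size in bytes."""
    addr: int
    size: int


def is_shared(addr: int) -> bool:
    """Dynamic check used for possibly-shared stores."""
    return SHARED_BASE <= addr < SHARED_BASE + SHARED_SIZE


def log_write(history: list, kind: StoreKind, addr: int, size: int) -> None:
    """Model of the consistency-management code that follows a store.

    Assumes a written range never straddles the shared-region boundary.
    """
    if kind is StoreKind.LOCAL:
        return  # no consistency code is emitted for local stores
    if kind is StoreKind.POSSIBLY_SHARED and not is_shared(addr):
        return  # run-time check found a non-shared address
    history.append(WriteRecord(addr, size))


def unoptimized(base: int, n: int, kind: StoreKind) -> list:
    """One write-history record per store inside the loop."""
    history = []
    for i in range(n):
        # The store to base + i * ELEM would happen here.
        log_write(history, kind, base + i * ELEM, ELEM)
    return history


def hoisted(base: int, n: int, kind: StoreKind) -> list:
    """Records combined into one and moved out of the loop.

    Valid only when the address is a unit-stride function of the
    induction variable and the loop contains no synchronization point.
    """
    history = []
    # The n stores happen in the loop, with no per-store logging.
    if n > 0:
        log_write(history, kind, base, n * ELEM)
    return history
```

For a must-shared array of four 8-byte elements starting at `SHARED_BASE`, `unoptimized` yields four records of 8 bytes each, while `hoisted` yields a single record covering the same 32 bytes. That is the kind of reduction in consistency-management work the abstract attributes to combining and hoisting.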
We have implemented prototypes of the compiler and the runtime system for the ADSM on a Fujitsu AP1000+ multicomputer. For these experiments we emulate the virtual memory mechanism, because the operating system of the AP1000+ does not support user-level page fault handlers and the optimization methods are independent of the virtual memory mechanism. We run three of the SPLASH-2 kernel benchmarks on 8 nodes to evaluate the effectiveness of the optimizations in the ADSM scheme. The experiments confirm that our optimization algorithm reduces execution time significantly: in LU-Contig, the optimized execution time is 8% of the unoptimized execution time.

Similar resources

Efficient Implementation of Software Release Consistency on Asymmetric Distributed Shared Memory

We have proposed an “Asymmetric Distributed Shared Memory: ADSM”, that provides users with an efficient shared memory model. The ADSM is a hybrid system that needs not only operating system support but also compiler support. The ADSM executes a load instruction as the shared-read with the assistance of the virtual memory mechanism. As for the shared-write, the ADSM executes a sequence of instruc...

Streaming memory consistency for efficient MPSoC design

Multiprocessor systems-on-chip (MPSoC) with distributed shared memory and caches are flexible when it comes to inter-processor communication but require an efficient memory consistency and cache coherency solution. In this paper we present a novel consistency model, streaming consistency, for the streaming domain in which tasks communicate through circular buffers. The model allows more reorder...

Aggressive Release Consistency for Software Distributed Shared Memory

As a software-based distributed shared memory (DSM) system is especially sensitive to the traffic amount over the network, we propose in this paper a new software DSM model which postpones the enforcement of data coherence at the time of the first shared memory access after an acquire, instead of at the time of the acquire like the lazy release consistency (LRC) model, leading to an aggressive ...

Modeling Relaxed Memory Consistency Protocols

This paper presents a modeling approach based on deterministic and stochastic Petri nets (DSPN's) for analyzing memory consistency protocols for multiprocessors with Distributed Shared Memory (DSM). DSPN's are a numerically solvable modeling formalism with a graphical representation. The modeling approach addresses in particular the performance degradation due to the amount of message exchange,...

SilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Clusters

Multithreaded parallel system with software Distributed Shared Memory (DSM) is an attractive direction in cluster computing. In these systems, distributing workloads and keeping the shared memory operations efficient are critical issues. Distributed Cilk (Cilk 5.1) is a multithreaded runtime system for SMP clusters with the support of divide-and-conquer programming paradigm. However, there is n...

Publication date: 1997